
    Adaptive text mining: Inferring structure from sequences

    Text mining is about inferring structure from sequences representing natural language text, and may be defined as the process of analyzing text to extract information that is useful for particular purposes. Although hand-crafted heuristics are a common practical approach for extracting information from text, a general, and generalizable, approach requires adaptive techniques. This paper studies the way in which the adaptive techniques used in text compression can be applied to text mining. It develops several examples: extraction of hierarchical phrase structures from text, identification of keyphrases in documents, locating proper names and quantities of interest in a piece of text, text categorization, word segmentation, acronym extraction, and structure recognition. We conclude that compression forms a sound unifying principle that allows many text mining problems to be tackled adaptively.
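
    The idea of using compression for text categorization can be illustrated with a minimal sketch. It is not the paper's method: it swaps in the general-purpose zlib compressor from the Python standard library as a stand-in for an adaptive compression model, and the names `compressed_size` and `categorize` are hypothetical. A document is assigned to the category whose sample text lets it be compressed into the fewest additional bytes.

```python
import zlib


def compressed_size(text: str) -> int:
    """Number of bytes zlib needs to encode the text at maximum compression."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def categorize(document: str, corpora: dict) -> str:
    """Return the label whose sample text best 'predicts' the document.

    corpora maps a category label to a sample of text from that category.
    Assumption: zlib stands in for an adaptive compression model here.
    """
    def extra_bytes(label: str) -> int:
        base = corpora[label]
        # Cost of the document given the category text as prior context.
        return compressed_size(base + " " + document) - compressed_size(base)

    return min(corpora, key=extra_bytes)
```

    Calling `categorize(document, {"sport": sport_text, "cooking": cooking_text})` returns the label whose text shares the most repeated material with the document. With very short samples the byte counts are noisy, so this is a toy rather than a usable classifier.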

    Customizing digital library interfaces with Greenstone

    Digital libraries are organized, focused collections of information. They are focused on a particular topic or theme—and good digital libraries will articulate the principles governing what is included. They are organized to make information accessible in particular, well-defined, ways—and good ones will include a description of how the information is organized (Lesk, 1997). The Greenstone digital library software is intended to help users construct simple collections of information very quickly. Indeed, only a few minutes of the user's time are needed to set up a collection based on a standard design and initiate the building process. Collections may be large—some comprise Gbytes of text; millions of documents. Furthermore, even larger volumes of information may be associated with a collection—typically audio, image, and video, with textual metadata. Once initiated, the mechanical process of building the collection may take from a few moments for a tiny collection to several hours for a multi-Gbyte one—perhaps even a day if it involves many different full-text indexes

    The Development and Usage of the Greenstone Digital Library Software

    The Greenstone software has helped spread the practical impact of digital library technology throughout the world, particularly in developing countries. This article reviews the project’s origins, usage, and the development of support mechanisms for Greenstone users. We begin with a brief summary of salient aspects of this open source software package and its user population. Next we describe how its international, humanitarian focus arose. We then review the special requirements imposed by the conditions that prevail in developing countries. Finally we discuss efforts to establish regional support organizations for Greenstone in India and Africa.

    Digital libraries for the developing world

    Digital libraries (DLs) are the killer app for information technology in developing countries. Priorities here include health, agriculture, nutrition, hygiene, sanitation, and safe drinking water. Computers are not a priority, but simple, reliable access to targeted information meeting these basic needs certainly is. DLs can assist human development by providing a non-commercial mechanism for distributing humanitarian information on topics such as health, agriculture, nutrition, hygiene, sanitation, and water supply. Many other areas, ranging from disaster relief to medical education, from the preservation and propagation of indigenous culture to educational material that addresses specific community problems, also benefit from new methods of information distribution

    Probing the link between residual entropy and viscosity of molecular fluids and model potentials

    This work investigates the link between residual entropy and viscosity based on wide-ranging, highly accurate experimental and simulation data. This link was originally postulated by Rosenfeld in 1977, and it is shown that this scaling results in an approximately monovariate relationship between residual entropy and reduced viscosity for a wide range of molecular fluids (argon, methane, CO2, SF6, refrigerant R-134a (1,1,1,2-tetrafluoroethane), refrigerant R-125 (pentafluoroethane), methanol, and water), and a range of model potentials (hard sphere, inverse power, Lennard-Jones, and Weeks-Chandler-Andersen). While the proposed "universal" correlation of Rosenfeld is shown to be far from universal, when used with the appropriate density scaling for molecular fluids, the viscosity of non-associating molecular fluids can be mapped onto the model potentials. This mapping results in a length scale that is proportional to the cube root of experimentally measurable liquid volume values.

    Creating and customizing digital library collections with the Greenstone Librarian Interface

    The Greenstone digital library software is a comprehensive system for building and distributing digital library collections. It provides a new way of organizing information and publishing it on the Internet. This paper describes how digital library collections can be created and customized with the new Greenstone Librarian Interface. Its basic features allow users to add documents and metadata to collections, create new collections whose structure mirrors existing ones, and build collections and put them in place for users to view. More advanced users can design and customize new collection structures. At the most advanced level, the Librarian Interface gives expert users interactive access to the full power of Greenstone, which could formerly be tapped only by running Perl scripts manually.

    Classification

    In classification learning, an algorithm is presented with a set of classified examples or “instances” from which it is expected to infer a way of classifying unseen instances into one of several “classes”. Instances have a set of features or “attributes” whose values define that particular instance. Numeric prediction, or “regression,” is a variant of classification learning in which the class attribute is numeric rather than categorical. Classification learning is sometimes called supervised because the method operates under supervision by being provided with the actual outcome for each of the training instances. This contrasts with data clustering (see entry Data Clustering), where the classes are not given, and with association learning (see entry Association Learning), which seeks any association, not just one that predicts the class.
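
    The setup described above can be made concrete with a minimal sketch. The entry does not name a particular algorithm, so a one-nearest-neighbour rule is assumed here purely for illustration, and the function name `nearest_neighbour_classify` is hypothetical. Each training instance is a tuple of numeric attribute values paired with its known class; an unseen instance receives the class of the closest training instance.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest_neighbour_classify(training, instance):
    """Classify an unseen instance by copying the class of its nearest example.

    training: list of (attributes, label) pairs, where attributes is a tuple
              of numbers and label is the known class of that instance.
    instance: tuple of numbers with the same length as the training attributes.
    """
    # The supervision is the label stored alongside each training instance.
    _, label = min(training, key=lambda example: dist(example[0], instance))
    return label


# Two classified training instances, each described by two numeric attributes.
training_set = [((1.0, 1.0), "small"), ((5.0, 5.0), "large")]
predicted = nearest_neighbour_classify(training_set, (1.2, 0.9))
```

    Replacing the categorical labels with numbers and averaging the values of several neighbours would turn the same scheme into numeric prediction, the regression variant mentioned in the entry.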

    Transforming Power Relationships: Leadership, Risk, and Hope. IHS Political Science Series No. 135, May 2013

    Chronic communal conflicts resemble the prisoner’s dilemma. Both communities prefer peace to war. But neither trusts the other, viewing the other’s gain as its own loss, so potentially shared interests often go unrealized. Achieving positive-sum outcomes from apparently zero-sum struggles requires a kind of risk-embracing leadership. To succeed, leaders must: a) see power relations as potentially positive-sum; b) strengthen negotiating adversaries instead of weakening them; and c) demonstrate hope for a positive future and take great personal risks to achieve it. Such leadership is exemplified by Nelson Mandela and F.W. de Klerk in the South African democratic transition. To illuminate the strategic dilemmas Mandela and de Klerk faced, we examine the work of Robert Axelrod, Thomas Schelling, and Josep Colomer, who highlight important dimensions of the problem but underplay the role of risk-embracing leadership. Finally we discuss leadership successes and failures in the Northern Ireland settlement and the Israeli-Palestinian conflict.